Interfacing of CASA and Multistream recognition
Authors
Abstract
In this paper we propose a running demonstration of the coupling between an intermediate processing step (named CASA), based on the harmonicity cue, and partial recognition, implemented with an HMM/ANN multistream technique [2]. The model is able to recognise words corrupted by narrow-band noise, either stationary or with a variable center frequency. The principle is to identify, frame by frame, the noisiest of four subbands by analysing an SNR-dependent representation. A static partial recogniser is then fed with the remaining subbands. On NUMBERS93, we establish the noisy-band identification (NBI) performance as well as the word error rate (WER), and we alter the correlation between these two indices by changing the distribution of the noise.
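The frame-by-frame selection described in the abstract can be illustrated with a minimal sketch. It assumes the SNR-dependent representation has already been reduced to one SNR estimate (in dB) per subband and per frame; the abstract does not specify this form, and the function names below (`noisiest_subband`, `select_clean_subbands`, `nbi_accuracy`) are hypothetical, not taken from the paper.

```python
from typing import List, Sequence

N_SUBBANDS = 4  # the abstract specifies four subbands


def noisiest_subband(snr_frame: Sequence[float]) -> int:
    """Return the index of the subband with the lowest SNR estimate in one frame.

    Assumption: a lower value means a noisier subband. Ties go to the lowest index.
    """
    if len(snr_frame) != N_SUBBANDS:
        raise ValueError(f"expected {N_SUBBANDS} SNR values, got {len(snr_frame)}")
    return min(range(N_SUBBANDS), key=lambda band: snr_frame[band])


def select_clean_subbands(snr_frames: Sequence[Sequence[float]]) -> List[List[int]]:
    """For each frame, list the subbands that would be passed to the partial recogniser."""
    kept = []
    for frame in snr_frames:
        noisy = noisiest_subband(frame)
        kept.append([band for band in range(N_SUBBANDS) if band != noisy])
    return kept


def nbi_accuracy(snr_frames: Sequence[Sequence[float]],
                 true_noisy_bands: Sequence[int]) -> float:
    """Fraction of frames in which the identified noisy band matches the true one."""
    if len(snr_frames) != len(true_noisy_bands):
        raise ValueError("frames and labels must have the same length")
    hits = sum(noisiest_subband(frame) == truth
               for frame, truth in zip(snr_frames, true_noisy_bands))
    return hits / len(true_noisy_bands)


# Two illustrative frames with made-up SNR estimates in dB
frames = [[10.0, 2.0, 8.0, 9.0], [5.0, 6.0, -1.0, 7.0]]
kept_bands = select_clean_subbands(frames)
accuracy = nbi_accuracy(frames, [1, 3])
```

Here `kept_bands` holds the three subbands retained for each frame, and `accuracy` is a simple stand-in for the NBI score. How the paper actually derives the SNR-dependent representation from the harmonicity cue, and how the recogniser combines the remaining subbands, is not covered by this sketch.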
Similar resources
Interfacing of CASA and partial recognition based on a multistream technique
We propose a running demonstration of coupling between an intermediate processing step (named CASA), based on the harmonicity cue, and partial recognition, implemented with a HMM/ANN multistream technique [2]. The model is able to recognise words corrupted with narrow band noise, either stationary or having variable center frequency. The principle is to identify frame by frame the most noisy su...
A New Snr-feature Mapping for Robust Multistream Speech Recognition
We describe a new model of CASA labelling which assigns to each time-frequency region a probability "clean" enough to feed a multistream recogniser only adapted to clean data. This labelling process is based on the harmonicity of the speech. The probability is evaluated according to a SNR-feature mapping and the choice of a SNR decision threshold. This allows an extension of a previous method [...
A Measure of Speech and Pitch Reliability from Voicing
We propose a CASA labelling method of the TF representation, which is based on the periodicity of the speech, related to the voicing. A local voicing index is estimated in four subbands after demodulation of the signal. This index is used as a reliability measure for both pitch identification and speech recognition. First, this model allows robust f0 identification thanks to the voicing index, ...
A CASA-labelling model using the localisation cue for robust cocktail-party speech recognition
We propose a new cocktail-party recognition technique based on the coupling of a CASA-labelling method using the TDOA (Time Delay Of Arrival) with multistream recognition. This is an alternative to the classical "segregate and recognise" architecture. First, we have recorded a stereo database ST-NB95 from the mono Numbers95. This is composed of binary mixtures of sentences at 0dB, placed left a...
Interfacing Sound Stream Segregation to Automatic Speech Recognition - Preliminary Results on Listening to Several Sounds Simultaneously
This paper reports the preliminary results of experiments on listening to several sounds at once. Two issues are addressed: segregating speech streams from a mixture of sounds, and interfacing speech stream segregation with automatic speech recognition (ASR). Speech stream segregation (SSS) is modeled as a process of extracting harmonic fragments, grouping these extracted harmonic fragments, an...